{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Fashion MNIST Image Classification with TensorFlow on Cloud ML Engine\n",
    "\n",
    "This notebook demonstrates how to train a deep neural network model for image classification and deploy it as an Application Programming Interface (API) (i.e. web service) for online predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "PROJECT = 'my-project-id' # REPLACE WITH YOUR PROJECT ID\n",
    "BUCKET = 'my-bucket-name' # REPLACE WITH YOUR BUCKET NAME\n",
    "REGION = 'us-central1' # REPLACE WITH YOUR BUCKET REGION e.g. us-central1\n",
    "MODEL_TYPE='dnn'  #  'dnn' or 'cnn'\n",
    "\n",
    "# do not change these\n",
    "os.environ['PROJECT'] = PROJECT\n",
    "os.environ['BUCKET'] = BUCKET\n",
    "os.environ['REGION'] = REGION\n",
    "os.environ['MODEL_TYPE'] = MODEL_TYPE\n",
    "os.environ['TFVERSION'] = '1.8'  # Tensorflow version"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%bash\n",
    "gcloud config set project $PROJECT\n",
    "gcloud config set compute/region $REGION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run as a Python module\n",
    "\n",
    "Now since we want to run our code on Cloud ML Engine, we've packaged it as a Python module.\n",
    "\n",
    "The `model.py` and `task.py` files containing the model code are in <a href=\"fashionmodel/trainer\">fashionmodel/trainer</a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%bash\n",
    "rm -rf fashionmodel.tar.gz fashion_trained\n",
    "gcloud ml-engine local train \\\n",
    "   --module-name=trainer.task \\\n",
    "   --package-path=${PWD}/fashionmodel/trainer \\\n",
    "   -- \\\n",
    "   --output_dir=${PWD}/fashion_trained \\\n",
    "   --train_steps=1000 \\\n",
    "   --learning_rate=0.01 \\\n",
    "   --train_batch_size=512 \\\n",
    "   --model=$MODEL_TYPE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Training should finish in just a few seconds because it ran outside of the Jupyter's runtime, as an independent process of the node's operating system. You can explore the training metrics using TensorBoard."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from google.datalab.ml import TensorBoard\n",
    "TensorBoard().start('fashion_trained'.format(BUCKET, MODEL_TYPE))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for pid in TensorBoard.list()['pid']:\n",
    "  TensorBoard().stop(pid)\n",
    "  print 'Stopped TensorBoard with pid {}'.format(pid)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Next, let's use Cloud ML Engine so we can train on GPU:** `--scale-tier=BASIC_GPU` **and deploy the model as a web service API**\n",
    "\n",
    "Note that GPU speed up depends on the model type. You'll notice that more complex models train substantially faster on GPUs. When you are working with simple models that take just seconds to minutes to train on a single node, keep in mind that Cloud ML Engine introduces a few minutes of overhead for training job setup & teardown."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%bash\n",
    "OUTDIR=gs://${BUCKET}/fashion/trained_${MODEL_TYPE}\n",
    "JOBNAME=fashion_${MODEL_TYPE}_$(date -u +%y%m%d_%H%M%S)\n",
    "echo $OUTDIR $REGION $JOBNAME\n",
    "gcloud storage rm --recursive --continue-on-error $OUTDIR\n",
    "gcloud ml-engine jobs submit training $JOBNAME \\\n",
    "   --region=$REGION \\\n",
    "   --module-name=trainer.task \\\n",
    "   --package-path=${PWD}/fashionmodel/trainer \\\n",
    "   --job-dir=$OUTDIR \\\n",
    "   --staging-bucket=gs://$BUCKET \\\n",
    "   --scale-tier=BASIC_GPU \\\n",
    "   --runtime-version=$TFVERSION \\\n",
    "   -- \\\n",
    "   --output_dir=$OUTDIR \\\n",
    "   --train_steps=1000 --learning_rate=0.01 --train_batch_size=512 \\\n",
    "   --model=$MODEL_TYPE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Monitoring training with TensorBoard\n",
    "\n",
    "Use this cell to launch tensorboard"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from google.datalab.ml import TensorBoard\n",
    "TensorBoard().start('gs://{}/fashion/trained_{}'.format(BUCKET, MODEL_TYPE))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for pid in TensorBoard.list()['pid']:\n",
    "  TensorBoard().stop(pid)\n",
    "  print 'Stopped TensorBoard with pid {}'.format(pid)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deploying and predicting with model\n",
    "\n",
    "Deploy the model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%bash\n",
    "MODEL_NAME=\"fashion\"\n",
    "MODEL_VERSION=${MODEL_TYPE}\n",
    "MODEL_LOCATION=$(gcloud storage ls gs://${BUCKET}/fashion/trained_${MODEL_TYPE}/export/exporter | tail -1)\n",
    "echo \"Deleting and deploying $MODEL_NAME $MODEL_VERSION from $MODEL_LOCATION ... this will take a few minutes\"\n",
    "#gcloud ml-engine versions delete ${MODEL_VERSION} --model ${MODEL_NAME}\n",
    "#gcloud ml-engine models delete ${MODEL_NAME}\n",
    "gcloud ml-engine models create ${MODEL_NAME} --regions $REGION\n",
    "gcloud ml-engine versions create ${MODEL_VERSION} --model ${MODEL_NAME} --origin ${MODEL_LOCATION} --runtime-version=$TFVERSION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The previous step of deploying the model can take a few minutes. If it is successful, you should see an output similar to this one:\n",
    "\n",
    "<pre>\n",
    "Created ml engine model [projects/qwiklabs-gcp-27eb45524d98e9a5/models/fashion].\n",
    "Creating version (this might take a few minutes)......\n",
    "...................................................................................................................done.\n",
    "</pre>\n",
    "\n",
    "Next, download a local copy of the Fashion MNIST dataset to use with Cloud ML Engine for predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.fashion_mnist.load_data()\n",
    "LABELS = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To predict with the model, save one of the test images as a JavaScript Object Notation (JSON) file. Also, take a look at it as a graphic and notice the expected class value in the title."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "HEIGHT=28\n",
    "WIDTH=28\n",
    "\n",
    "IMGNO=12 #CHANGE THIS to get different images\n",
    "\n",
    "#Convert raw image data to a test.json file and persist it to disk\n",
    "import json, codecs\n",
    "jsondata = {'image': test_images[IMGNO].reshape(HEIGHT, WIDTH).tolist()}\n",
    "json.dump(jsondata, codecs.open('test.json', 'w', encoding='utf-8'))\n",
    "\n",
    "#Take a look at a sample image and the correct label from the test dataset\n",
    "import matplotlib.pyplot as plt\n",
    "plt.imshow(test_images[IMGNO].reshape(HEIGHT, WIDTH))\n",
    "title = plt.title('{} / Class #{}'.format(LABELS[test_labels[IMGNO]], test_labels[IMGNO]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's how the same image looks when it saved in the test.json file for use with the prediction API."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%bash\n",
    "cat test.json"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Send the file to the prediction service and check whether the model you trained returns the correct prediction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%bash\n",
    "gcloud ml-engine predict \\\n",
    "   --model=fashion \\\n",
    "   --version=${MODEL_TYPE} \\\n",
    "   --json-instances=./test.json"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is what my prediction service returned based on a model that I trained with Cloud ML Engine. Notice that the model predicts a probability of roughly 0.87 for the sneaker (class #7).\n",
    "\n",
    "<pre>\n",
    "CLASSES  PROBABILITIES\n",
    "7        [2.627301398661075e-08, 1.843199037843135e-09, 6.106044111220399e-06, 5.581538289334276e-07, 9.015485602503759e-08, 0.09837665408849716, 3.8953788816797896e-07, 0.8761420249938965, 0.0034074552822858095, 0.02206665836274624]\n",
    "</pre>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<pre>\n",
    "# Copyright 2017 Google Inc. All Rights Reserved.\n",
    "#\n",
    "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "# you may not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "#      http://www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License.\n",
    "</pre>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
